\section*{Appendix}

\input{./sections/3.tex}


\subsection*{Proof of Lemma~\ref{seedlemma}}

\begin{proof}

We again prove by contradiction. Let us assume such seed does not exist, then
each seed will either have at least one error in the non-overlapping region
or have at least have two errors in the overlapping region or have a single
error in the left overlapping region.

According to Theorem 1, there must exist at least one seed with an intact
non-overlapping region and a single error in the left overlapping region. A
seed having an error in its left overlapping region also implies that its left
neighbor seed has an error in the right overlapping region. Since there exists
no seed that only contains a single error in the right overlapping region as we
assumed, the left neighbor seed has to have at least one error either in the
non-overlapping region or in the left overlapping region. If the error is in
the non-overlapping region, then we have formed a \textbf{complete chain},
which is defined as a chain of seeds which is linked together by errors in the
common overlapping regions. A complete chain links as many as seeds as possible
until it reaches to a seed who has no error in the overlapping region and
breaks the link. If the error is in the left overlapping region, however, the
two seeds will not form a complete chain, as the error in the left overlapping
region links yet another seed. The link continues until it either reaches to a
seed which has no error in the left overlapping region of that seed or it
reaches the left most seed in the read which has no left overlapping region at
all. However, as the chain grows, each time when it adds one more seed, it also
includes at least one more errors, resulting at least one error per seed in the
chain.

By definition, any seed without error in the right overlapping region will be
the right end of a chain. Likewise, any seed without error in the left
overlapping region will be the left end of a chain. A seed with no error in
the overlapping region will be a complete chain by itself. Seeds without
overlapping regions, such as the left most and the right most seeds of the read,
will also be the ends of complete chains. Each complete chain can have at most a
single left seed end and a single right end seed. The left end of a complete chain may freely overlaps with
the right end of another complete chain in the read as long as they do not share errors at the
overlapping region.

With our assumptions, a read can only contain two kinds of chains: the first
kind contains seeds which either has at least an error in the non-overlapping
region or two errors in the overlapping regions as we described in
Theorem~\ref{overlap}; and the second kind is the same with the first kind
except that the right most seed of the chain only has a single error in the
left overlapping region. Nonetheless, both kinds of chains require at least one
error per seed, requiring $e+1$ errors for $e+1$ seeds, contradicts to our
given condition.

\end{proof}






